No-regret Algorithms for Online Convex Programs

نویسنده

  • Geoffrey J. Gordon
چکیده

Online convex programming has recently emerged as a powerful primitive for designing machine learning algorithms. For example, OCP can be used for learning a linear classifier, dynamically rebalancing a binary search tree, finding the shortest path in a graph with unknown edge lengths, solving a structured classification problem, or finding a good strategy in an extensive-form game. Several researchers have designed no-regret algorithms for OCP. But, compared to algorithms for special cases of OCP such as learning from expert advice, these algorithms are not very numerous or flexible. In learning from expert advice, one tool which has proved particularly valuable is the correspondence between no-regret algorithms and convex potential functions: by reasoning about these potential functions, researchers have designed algorithms with a wide variety of useful guarantees such as good performance when the target hypothesis is sparse. Until now, there has been no such recipe for the more general OCP problem, and therefore no ability to tune OCP algorithms to take advantage of properties of the problem or data. In this paper we derive a new class of no-regret learning algorithms for OCP. These Lagrangian Hedging algorithms are based on a general class of potential functions, and are a direct generalization of known learning rules like weighted majority and external-regret matching. In addition to proving regret bounds, we demonstrate our algorithms learning to play one-card poker.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

No-regret algorithms for Online Convex Programs

Online convex programming has recently emerged as a powerful primitive for designing machine learning algorithms. For example, OCP can be used for learning a linear classifier, dynamically rebalancing a binary search tree, finding the shortest path in a graph with unknown edge lengths, solving a structured classification problem, or finding a good strategy in an extensive-form game. Several res...

متن کامل

On Fixed Convex Combinations of No-Regret Learners

No-regret algorithms are powerful tools for learning in online convex problems that have received increased attention in recent years. Considering affine and external regret, we investigate what happens when a set of no-regret learners (voters) merge their respective strategies in each learning iteration to a single, common one in form of a convex combination. We show that an agent who executes...

متن کامل

No-Regret Algorithms for Unconstrained Online Convex Optimization

Some of the most compelling applications of online convex optimization, including online prediction and classification, are unconstrained: the natural feasible set is R. Existing algorithms fail to achieve sub-linear regret in this setting unless constraints on the comparator point x̊ are known in advance. We present algorithms that, without such prior knowledge, offer near-optimal regret bounds...

متن کامل

Online Continuous Submodular Maximization

In this paper, we consider an online optimization process, where the objective functions are not convex (nor concave) but instead belong to a broad class of continuous submodular functions. We first propose a variant of the Frank-Wolfe algorithm that has access to the full gradient of the objective functions. We show that it achieves a regret bound of O( √ T ) (where T is the horizon of the onl...

متن کامل

Regret bounds for Non Convex Quadratic Losses Online Learning over Reproducing Kernel Hilbert Spaces

We present several online algorithms with dimension-free regret bounds for general nonconvex quadratic losses by viewing them as functions in Reproducing Hilbert Kernel Spaces. In our work we adapt the Online Gradient Descent, Follow the Regularized Leader and the Conditional Gradient method meta algorithms for RKHS spaces and provide regret bounds in this setting. By analyzing them as algorith...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006